Abstract

Background: Little objective evidence exists regarding what makes a good lecture. Our purpose was to determine 
qualities of radiology review course lectures that are associated with positive audience evaluation. 

Methods: 57 presentations from the Ottawa Resident Review Course (2012) were analyzed by a PGY4 radiology 
resident blinded to the result of audience evaluation. Objective data extracted were: slides per minute, lines of text 
per text slide, words per text slide, cases per minute, images per minute, images per case, number of audience 
laughs, number of questions posed to the audience, number of summaries, inclusion of learning objectives, ending 
on time, use of pre/post-test and use of special effects. Mean audience evaluation scores for each talk from daily 
audience evaluations (up to 60 per talk) were standardized out of 100. Pearson correlation coefficients were calculated
between continuous variables and audience evaluation scores. Student t-tests were performed on categorical
variables and audience evaluation scores.

Results: Strongest positive association with audience evaluation scores was for image quality (r = 0.57) and number 
of times the audience laughed (r = 0.3). Strongest negative association was between images per case and audience 
scores (r = -0.25). Talks with special effects were rated better (mean score 94.3 vs. 87.1, p < 0.001). Talks with the 
highest image quality were rated better (mean score 94.1 vs. 87.5, p < 0.001). Talks which contained a pre/post-test 
were rated better (mean score 92 vs. 87.8, p = 0.004). 

Conclusion: Many factors go into making a great review course lecture. At the University of Ottawa Resident 
Review Course, high quality images, use of special effects, use of pre/post-test and humor were most strongly 
associated with high audience evaluation scores. High image volume per case may be negatively associated with 
audience evaluation scores. 



Background 

Educational presentations in Diagnostic Imaging are often
crafted through experience, intuition, and feedback from
previous lectures. Although many articles have been written
about what goes into creating a great radiology lecture, as
well as lectures in general, these are often not based on
objective data, but are in the domain of 'expert opinion'
[1-5]. Little is known regarding the relationship of certain
lecture variables (e.g. number of cases, number of slides)
with lecture effectiveness.

Many traits have been associated with effective
presentations, although they have not necessarily been
objectively studied to determine if they correlate with
audience evaluation. These include lectures with clearly
stated objectives, high quality images, techniques that 
encourage audience participation such as audience ques- 
tioning, as well as strategies to motivate and entertain 
the listener including humor [1]. Other traits such as 
text slides with too many lines per slide and too many 
words per slide have been associated with lower quality 
presentations [2]. Analysis of comments received at a 
National Radiology Continuing Medical Education 
Course demonstrated that poor image quality such as 
images that are too dark or incompletely projected on a 
screen commonly resulted in negative feedback [3]. 

Radiology resident review courses are common and
aim to prepare residents for board examinations [6-9].
Lecturing at these courses can be demanding and may
require a more case-intensive style than lectures given at
other diagnostic imaging courses. The University of
Ottawa hosts an annual week-long resident review
course that aims to prepare residents for their Canadian
Radiology Board Examinations [9]. This course was
started in 2011 and has been attended by more than 500
individuals over 3 years. To continually improve
course quality, the course directors gather feedback from
attendees regarding each lecture.

The purpose of our study is to evaluate which radi- 
ology review course lecture variables are associated with 
positive audience evaluation. 

Methods 

The local research ethics board (Ottawa Hospital Research
Ethics Board) waived approval for this study.

The 2nd Annual Ottawa Resident Review Course, which
took place from March 25-30, 2012, had a total of 57
presentations given by 39 separate speakers; it was attended
by more than 150 people. The vast majority (>80%) of
attendees were radiology residents from Canadian residency
programs; the remaining attendees were a combination of
residents from American radiology residency programs
and radiologists practicing in Canada (personal
communication, Sandra Leslie, May 2013). Video capture
of the slides and simultaneous audio of the presenters
were saved. PDF versions of each talk were also saved for
review. Forty-six of the 57 presentations had video files
which could be reviewed (some speakers did not consent
to recording); all 57 presentations had pdf files available.

Audience evaluations of each lecture from course at- 
tendees were collected. Lectures were scored on a 1-5 
scale (5 being best) and freeform comments could be 
made. These were standardized to a maximum score of 
100 by adding all scores achieved by a given talk, multiply- 
ing by 20 and then dividing by the total number of 
respondents. 
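
The standardization described above can be sketched in a few lines of Python. This is only an illustration of the stated arithmetic (sum of ratings, multiplied by 20, divided by the number of respondents); the function name and the example ratings are hypothetical and are not taken from the course data.

```python
def standardized_score(ratings):
    """Convert a list of 1-5 audience ratings into a score out of 100."""
    if not ratings:
        raise ValueError("at least one rating is required")
    if any(r < 1 or r > 5 for r in ratings):
        raise ValueError("ratings must lie on the 1-5 scale")
    # Sum of all ratings, multiplied by 20, divided by the number of respondents.
    return sum(ratings) * 20 / len(ratings)


# Hypothetical ratings from four respondents (not study data).
example = standardized_score([5, 4, 5, 3])
print(example)  # 85.0
```

A talk rated 5 by every respondent reaches the maximum of 100, and a talk rated 1 by every respondent scores 20.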

Data extraction 

The following objective data were collected by reviewing
the recorded lectures (when available) or the lecture pdf
files: use of objectives or outline, total slides per minute,
number of text lines per text slide, number of words per
text slide, cases per minute, images per minute, images
per case, number of episodes of audience laughter per
presentation, number of questions posed to the audience
per presentation, number of summaries or summary
slides, and use of animation. 'Total slides' was defined as all
slides contained in the presentation. 'Text slides' was
defined as slides containing only text content and no images.
Lectures were classified as either 'didactic' or 'unknown
case presentations'. Lectures were classified as 'unknown
case presentations' if the majority of cases were initially
presented without an associated diagnosis.
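
As an illustration only, the rate variables listed above could be derived from raw per-lecture counts roughly as follows. The class, field names and example counts are hypothetical; the study does not describe its extraction sheet at this level of detail.

```python
from dataclasses import dataclass


@dataclass
class LectureCounts:
    """Raw counts for one lecture (all field names are assumptions)."""
    minutes: float
    total_slides: int
    text_slides: int
    text_lines: int
    text_words: int
    cases: int
    images: int


def ratio(numerator, denominator):
    """Return numerator / denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


def derived_variables(c):
    """Compute the per-minute, per-slide and per-case variables."""
    return {
        "slides_per_minute": ratio(c.total_slides, c.minutes),
        "lines_per_text_slide": ratio(c.text_lines, c.text_slides),
        "words_per_text_slide": ratio(c.text_words, c.text_slides),
        "cases_per_minute": ratio(c.cases, c.minutes),
        "images_per_minute": ratio(c.images, c.minutes),
        "images_per_case": ratio(c.images, c.cases),
    }


# Hypothetical 30-minute lecture (not study data).
demo = derived_variables(LectureCounts(30, 90, 10, 60, 300, 30, 120))
```

Returning None for a zero denominator keeps lectures without cases or text slides from raising an error; how such lectures were actually handled in the study is not stated.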

Image quality was assessed by reviewing recorded
lectures and scored on a subjective 1-5 scale with 5 being
best. Higher scores were awarded to talks with images
that were properly cropped, possessed suitable contrast
and clearly demonstrated the relevant findings.
Presentations in which all of the images had these traits
were scored as 5, those in which only half did were scored
as 3, and those in which none did were scored as 1. The
framework for the image quality evaluation was derived
from frequently cited image quality criticisms in a prior
study of radiology lectures [3].
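
The three anchor points of the image quality scale (none, half, all) happen to lie on a straight line, which the sketch below makes explicit. The linear interpolation between anchors is an assumption made purely for illustration; in the study the score was a subjective judgement by the reviewer, not a computed value.

```python
def image_quality_score(good_images, total_images):
    """Map the fraction of acceptable images onto the 1-5 scale.

    Assumes a linear scale through the three stated anchors:
    none -> 1, half -> 3, all -> 5. The interpolation in between
    is hypothetical.
    """
    if total_images <= 0:
        raise ValueError("the talk must contain at least one image")
    if not 0 <= good_images <= total_images:
        raise ValueError("good_images must be between 0 and total_images")
    fraction = good_images / total_images
    return 1 + 4 * fraction


# The three anchors described in the text.
anchors = [image_quality_score(n, 10) for n in (0, 5, 10)]
```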

Data collection was performed by a PGY4 radiology
resident (L.C.). The data collector was not blinded to the
speaker identity; this was not feasible because
speakers were recognizable from the audio files.

Written comments from course evaluations were 
reviewed qualitatively. Commonly recurring comments 
were tabulated. Common occurrence was defined as oc- 
curring more than 3 times. 
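
The tabulation rule above (a comment is "common" when it occurs more than 3 times) can be sketched as below. The sketch assumes that each written comment has already been assigned to a short theme label by the reviewer; that coding step, the function name and the sample labels are hypothetical.

```python
from collections import Counter


def recurring_comments(coded_comments, threshold=3):
    """Return themes that occur more than `threshold` times."""
    counts = Counter(coded_comments)
    return {theme: n for theme, n in counts.items() if n > threshold}


# Hypothetical coded comments (not study data).
sample = ["Lots of cases"] * 4 + ["Too quiet"] * 3 + ["Labels"]
common = recurring_comments(sample)
print(common)  # {'Lots of cases': 4}
```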

Data analysis 

Pearson correlation coefficients were calculated between 
each of the extracted variables with non-dichotomous 
scoring and the presentation evaluation scores by 
attendees. 
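
For readers who want to reproduce this step outside a spreadsheet, a minimal pure-Python version of the Pearson correlation coefficient is sketched below. It illustrates the standard formula rather than the authors' spreadsheet procedure; the function name and the example values are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations about the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return sxy / sqrt(sxx * syy)


# Hypothetical image quality scores and evaluation scores (not study data).
r = pearson_r([2, 3, 4, 5, 5], [80, 85, 88, 93, 95])
```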

For the variables with dichotomous scoring, a Student
t-test analysis was performed (use of objectives/outlines,
presence of audience laughter, finishing on time, use of 
summaries during the presentation, questions posed to 
the audience, use of pre or post-test, unknown vs. didac- 
tic, use of special effects such as animation). Talks with 
slides per minute in excess of one standard deviation 
(SD) above the mean were compared to the remainder 
of the presentations. For the 55 presentations with im- 
ages, lectures with cases per minute, images per minute 
or images per case one SD above the mean were com- 
pared to the remaining talks. Similarly, of the 44 talks 
on video which also contained images, those achieving 
image quality scores of 5 were compared to talks 
awarded scores of 1 to 4. 
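
The two steps described above (splitting talks at one SD above the mean of a variable, then comparing evaluation scores between the two groups) can be sketched as follows. The text does not state whether the equal-variance or unequal-variance form of the t-test was used, nor whether the SD was a sample or population SD; the sketch assumes the classic pooled-variance Student statistic and the sample SD, and returns only the t statistic and degrees of freedom. Function names and data are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev, variance


def split_one_sd_above(values, scores):
    """Split scores by whether the paired value exceeds mean + 1 SD."""
    cutoff = mean(values) + stdev(values)  # sample SD; an assumption
    high = [s for v, s in zip(values, scores) if v > cutoff]
    rest = [s for v, s in zip(values, scores) if v <= cutoff]
    return high, rest


def student_t(group_a, group_b):
    """Pooled-variance two-sample t statistic and its degrees of freedom."""
    na, nb = len(group_a), len(group_b)
    if na < 2 or nb < 2:
        raise ValueError("each group needs at least two observations")
    df = na + nb - 2
    # Pooled estimate of the common variance (equal-variance assumption).
    pooled = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / df
    t = (mean(group_a) - mean(group_b)) / sqrt(pooled * (1 / na + 1 / nb))
    return t, df


# Hypothetical evaluation scores for talks with and without a feature.
t_stat, dof = student_t([94, 96, 95], [86, 88, 87])
```

Converting the statistic to a p value requires the t distribution, which a spreadsheet or a statistics package provides directly.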

A low score group (LSG) and a high score group (HSG) were
defined as talks awarded a standardized feedback score
more than one SD below or more than one SD above the
mean overall feedback score, respectively. Average values
for many of the above described parameters were
compared between the high scoring and low scoring groups
using the Student t-test. Parameters analyzed as a
proportion were defined with 0 being none of the
presentations in the group possessing a characteristic, 1
being all of the presentations possessing a characteristic and
values in between corresponding to the fraction of talks
displaying the trait in question.

Table 1 Recurring comments for HSG presentations and LSG presentations

HSG presentations:         LSG presentations:
Lots of cases              Handout and talk do not match
Beautiful cases            Inaccurate or mistakes
Good pace/speed            No differentials. No differential diagnosis slides. Too fast when discussing differentials
Clear/organized            Too slow too fast.
Labels                     Too quiet. Cannot hear.
Material resembles exam    Images bad, too small, too many
                           Not enough cases
                           Disorganized
                           Incomplete

Statistical analysis was performed with Microsoft Excel
(Microsoft Corporation, Redmond, Washington, 2013,
version 15.0.4454.1503).

Results 

The number of evaluations received for each presenta- 
tion ranged from 26-60 with a mean of 47.8. Presenta- 
tion scores ranged in value from 72 to 97 (out of 100). 
An average score of 87.8 was achieved with a standard 
deviation of 6.5. There were 9 presentations which 
achieved a score of greater than 94.3, defined as the 
"high scoring group". There were 11 presentations that 
received a score of less than 81.3, defined as the "low 
scoring group". Of the 57 presentations, 55 contained 
Diagnostic Imaging (DI) images, such as images from 
MRI, CT, US and plain radiography studies. 
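
The two cut-offs quoted above follow directly from the reported mean and standard deviation: 87.8 + 6.5 = 94.3 and 87.8 - 6.5 = 81.3. A small sketch of the resulting grouping rule is given below; the function name and the "middle" label for the remaining talks are hypothetical.

```python
def score_group(score, mean=87.8, sd=6.5):
    """Assign a standardized score to the high, low or middle group."""
    # Rounded to one decimal to match the reported cut-offs.
    high_cutoff = round(mean + sd, 1)  # 94.3
    low_cutoff = round(mean - sd, 1)   # 81.3
    if score > high_cutoff:
        return "HSG"
    if score < low_cutoff:
        return "LSG"
    return "middle"


# The highest and lowest reported presentation scores.
groups = [score_group(97), score_group(72)]
print(groups)  # ['HSG', 'LSG']
```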

The commonly recurring written qualitative positive
comments amongst the high scoring group presentations
and the commonly encountered written negative
comments amongst the low scoring group presentations
are presented in Table 1.

Table 2 Correlation between presentation score and various characteristics

Parameter                          Number of presentations   Correlation coefficient
Image quality score                44                        0.57
Number of times audience laughed   46                        0.3
Slides per minute                  57                        0.1
Number summaries during talk       46                        0.06
Number of questions for audience   46                        0.05
Cases per minute                   55                        -0.0048
Text lines per text slide          57                        -0.1
Images per minute                  55                        -0.176
Words per text slide               57                        -0.2
Images per case                    55                        -0.25

Table 3 Comparison of scores based on presence or absence of a feature

Parameter                                               Present        Average  Absent        Average  p value
                                                                       score                  score
Contains sophisticated special effects                  10             94.3     34            87.1     3.83 × 10e-7
Contains at least one of either a pretest or posttest   6              92       40            87.8     0.004
Audience laughed during talk                            11                      35            87.5     0.07
Objectives or outline                                   30                      27            87.5     0.22
Presentation was on time                                30                      16                     0.298
Summary occurred during talk                            27                      19                     0.34
Didactic lecture vs. unknown cases                      28 (didactic)           16 (unknown)  88.7     0.95
Questions for audience                                  8                                              0.95

46 talks with video, 44 talks with DI images.
44 talks with images.

Correlations between presentation scores and various
parameters are listed in Table 2. The strongest
correlation was between presentation scores and image quality
scores, r = 0.57. The second strongest correlation was
identified between presentation scores and the number
of times the audience laughed during the talk, r = 0.3.
The strongest negative correlation was between
presentation scores and 'images per case', r = -0.25.

Comparison of scores based on various characteristics
is summarized in Tables 3, 4 and 5. Presentations with
sophisticated special effects demonstrated an average
score of 94.3 whereas the remaining presentations had
an average score of 87.1 (p < 0.01) (Table 3).
Presentations which contained one or both of a pretest or
posttest component scored an average of 92 whereas those
with neither scored an average of 87.8 (p = 0.004)
(Table 3). Presentations with an image quality score of 5
received an average score of 94.1 compared to the rest
of the presentations which were awarded an average
score of 87.5 (p < 0.01) (Table 4). An interesting outcome
was that the average score for presentations with images
per case in excess of one standard deviation above the
mean was 83.4 whereas for the remainder of the
presentations it was 88.8 (p = 0.027) (Table 5).

Table 4 Comparison of scores based on reviewer scores from video review

Parameter                   Talks with   Average  Talks with score  Average  p value
                            score of 5   score    less than 5       score
Image quality score (1-5)   8            94.1     36                87.5     9.34 × 10e-6

46 talks with video, 44 of these had DI images.

Table 5 Comparison of presentation scores based on a parameter value

Parameter          Average  One SD      Number of presentations  Average  Number of remaining  Average  p value
                            above mean  one SD above mean        score    presentations        score
Images per case    3.9      5           8                        83.4     47                   88.8     0.027
Images per minute  3.4      5.2         6                        84.5     49                   88.4     0.16
Slides per minute  2.69     3.72        5                        89.8     52                   87.5     0.47
Cases per minute   1        1.55        8                        86.7     47                   88.2     0.54

Two presentations had no Diagnostic Imaging images.

Tables 6 and 7 summarize the comparison of the high
score group versus the low score group across a variety 
of parameters. Between the high score group and low 
score group, average image quality scores revealed a dif- 
ference of 4.4 vs. 3.6 (p = 0.006) while the proportion of 
talks with and without sophisticated special effects dem- 
onstrated a difference of 0.625 vs. 0 (p = 0.006). 

Discussion 

Consistent with previous reports [2], our results indicate
that high quality images that were properly cropped, well
projected, possessed suitable contrast and clearly
demonstrated the relevant findings were most strongly
associated with higher attendee evaluation scores. The
second most strongly correlated variable with higher
scores was the "number of times the audience laughed",
suggesting that humor may be influential in achieving
more positive feedback. This would support the role of
entertainment in maintaining audience interest and
attention to establish an environment conducive to
learning [1]. It is interesting to note that this metric only
achieved near statistical significance when comparing
lectures with and without audience laughter.

Table 6 Comparison of scores between HSG and LSG based on video review

Parameter                                             HSG average  LSG average  p value
Image quality score (1-5)                             4.375        3.57         0.0056
Proportion containing sophisticated special effects   0.625        0            0.0056
Number of times audience laughed                      1.625        0            0.051
Proportion containing pretest or posttest             0.125        0            0.175
Proportion that finished on time                      0.875        0.75         0.28
Number of summaries                                   0.75         1            0.51
Number of questions for audience                      1.75         1.25         0.75
Proportion that were didactic lectures                0.5          0.5          N/A
(with the rest being unknown type lectures)

All HSG scores include 8 presentations. All LSG scores include 8 presentations
except Image quality score, Proportion containing sophisticated special effects
and Proportion containing any special effects, which each include 7 as one
talk did not have Diagnostic Imaging images.

It should be pointed out that the correlations in our 
study were not particularly strong; the strongest for 
image quality scores at r = 0.57 and the remainder less 
than 0.5, suggesting that determinants of a high quality 
Diagnostic Imaging lecture are likely multifactorial. 
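
One way to put these coefficients in perspective is to square them: in a simple linear relationship, r squared is the share of variance in one variable accounted for by the other. The sketch below applies this standard identity to three coefficients reported in the Results; it is an added illustration, not an analysis performed in the study, and the function and variable names are hypothetical.

```python
def variance_explained(r):
    """Share of variance accounted for by a linear relationship (r squared)."""
    return r * r


# Correlation coefficients as reported in the Results.
reported = {
    "image quality": 0.57,
    "audience laughter": 0.3,
    "images per case": -0.25,
}
shares = {name: variance_explained(r) for name, r in reported.items()}
```

Under this reading, even the strongest correlate, image quality, accounts for roughly a third of the variation in evaluation scores (0.57 squared is about 0.32), which is consistent with the multifactorial interpretation above.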

The use of special effects was also strongly associated 
with higher scores. This would include the use of clear 
annotations pointing to the appropriate findings as well 
as effective use of animation including builds and transi- 
tioning between slides [2]. 

Using a pretest or posttest to interactively focus the 
audience's attention on salient points of a lecture was 
also found to be associated with a higher score. This em- 
phasizes the central role that audience interaction and 
participation plays in the dynamic learning process [1]. 
However, simply having informal questions posed to the 
audience was not associated with higher scores. 

It is interesting that many factors which might
intuitively be expected to be associated with higher
evaluation scores were not confirmed in our study. This
would include having stated objectives or summarizing
material during the lecture, both of which have been felt
to enhance speaker effectiveness [1]. Not finishing on
time was previously documented to be a common
source of negative audience feedback [3], although this
was not confirmed to be statistically significant in our
results. As well, it has long been thought that text slides
with too many words per slide or too many text lines
per slide were ineffective [2,3], although in our study these
variables showed only a fairly weak negative correlation
with audience evaluation scores.

Table 7 Comparison of scores between all HSG and LSG presentations

Parameter                               HSG average  LSG average  p value
Words per text slide                    27.4         32           0.39
Slides per minute                       2.9          2.5          0.47
Text lines per text slide               5.3          6.7          0.61
Images per case                         4            4.3          0.76
Cases per minute                        1.01         0.97         0.85
Images per minute                       3.41         3.55         0.86
Proportion with objectives or outline   0.556        0.535        0.88

9 talks in HSG group and 11 talks in LSG group.

Despite radiology being an image based specialty, a
higher number of images was not associated with higher
audience evaluation scores. In fact, having more images
per case was the strongest negative correlation in our
study, although this is not incompatible with previous reports
[2]. This would suggest that image quality is much more
important than image quantity. This might be explained
by the fact that more images may result in a disorganized
or rushed lecture, perhaps with lower quality images which
may not effectively convey the presenter's message. It has
been suggested that images should not be overused or
included simply to impress the audience. Superfluous images
can be avoided by eliminating anything that does not assist
in attainment of the original lecture objectives [2].

An additional interesting finding is that the HSG and 
LSG (Tables 3 and 6) contain the same proportion of di- 
dactic vs. case-based lectures indicating that lecture style 
alone is not a determinant of success or failure. 

A weakness of our study is that not all
presentations had video files that were available to be
reviewed. The conference that we studied was targeted to
a specific audience of resident review course attendees,
which may limit the applicability to other courses in
radiology and other specialties. Similar studies at different
CME conferences may help identify whether these
patterns endure at courses with other themes. Additional
limitations include the varied number of audience evaluations
(ranging from 20-50% of attendees), and the fact that a
single individual extracted all of the objective data.

The comments from attendees point to some possible 
areas for further research such as organization, pace of 
talk, volume of speaker and accuracy of slides. 

Conclusion 

This study identifies many determinants of high quality
Diagnostic Imaging review course lectures. The factors
most strongly associated with lecture success are: high
quality images; use of fewer images per case; use of
special effects which clearly and precisely convey imaging
findings or clarify difficult concepts; use of
pretest/posttest tools; and, perhaps most importantly, a
sprinkling of humor. These findings can assist in
optimizing lecture preparation and guide further research.